Utilizing hardware transactional approach to execute code after initially utilizing software locking by employing pseudo-transactions

ABSTRACT

Utilizing a hardware transactional approach to execute a code section by employing pseudo-transactions, after initially utilizing software locking, is disclosed. A method is disclosed that utilizes a software approach to locking memory to execute a code section relating to memory. The software approach employs a pseudo-transaction to determine whether a hardware approach to transactional memory to execute the code section would have been successful. Where the hardware approach to transactional memory to execute the code section satisfies a threshold based on success of at least the pseudo-transaction, the method subsequently utilizes the hardware approach to execute the code section. The hardware approach may include starting a transaction inclusive of the code section, conditionally executing the transaction, and, upon successfully completing the transaction, committing execution of the transaction to the memory to which the code section relates.

RELATED APPLICATIONS

The present patent application is a continuation of the previously filed patent application entitled “Utilizing hardware transactional approach to execute code after initially using software locking by employing pseudo-transactions,” filed on Sep. 12, 2003, and assigned Ser. No. 10/661,017.

BACKGROUND OF THE INVENTION

1. Technical Field

This invention relates generally to executing a section of code on an all-or-nothing basis, such that the entire section of code is executed and committed to memory, or none of the section of code is executed and committed to memory. The invention relates more particularly to software locking approaches and hardware transactional approaches to such execution of code on an all-or-nothing basis.

2. Description of the Prior Art

In multiple-processor computing systems, more than one processor may attempt to affect the same memory at the same time. For instance, a number of transactions, which may be read or write requests or responses to resources such as memory, may vie for the same memory at the same time. If each transaction is allowed unfettered access to the same memory, the results can include corrupting the integrity of the data stored in this memory. For example, one transaction may read a given memory line, act upon the value read, and then write a new value to the memory line. While the transaction is acting upon the value it read from the memory line, another transaction may write a different value to the memory line. When the first transaction writes its new value to the memory line, the second transaction may not realize that its value has been overwritten.
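The interleaving described above can be replayed deterministically. The following is a minimal sketch and is not part of the original disclosure; the variable memory_line, the function lost_update_demo, and the specific values are hypothetical and chosen only for illustration.

```c
/* Hypothetical shared memory line (an assumption for illustration). */
static int memory_line;

/* Replays one fixed interleaving of two unsynchronized read-modify-write
 * sequences and returns the final value of the memory line. */
int lost_update_demo(void)
{
    memory_line = 10;
    int first_read  = memory_line;  /* first transaction reads 10          */
    int second_read = memory_line;  /* second transaction reads 10         */
    memory_line = second_read + 5;  /* second transaction writes 15        */
    memory_line = first_read + 1;   /* first transaction writes 11; the    */
                                    /* second transaction's +5 is lost     */
    return memory_line;
}
```

Had the two sequences been serialized, the final value would have been 16; with the interleaving shown, the second transaction's update is silently overwritten.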

One approach to ensuring that a number of transactions are not attempting to process the same memory at the same time is to use a software locking approach. In a software locking approach, a transaction must first successfully obtain a lock on the relevant lines of memory before it is able to process the data stored in these memory lines. If two transactions are attempting to process the same memory line, then one transaction will initially win the lock, and be able to process the memory line before the second transaction does. Thus, the transactions are implicitly serialized, so that they do not try to compete for the same memory line at the same time. A disadvantage to using the software locking approach is that it can add overhead to the processing of transactions that in most cases is unnecessary, since most of the time there will be no contention for desired memory lines. This can cause degradation in performance of the entire system.

Another approach to ensuring that a number of transactions are not attempting to process the same memory at the same time is to use a hardware transactional memory approach. In a hardware transactional memory approach, the hardware of a system, specifically its processors, has the ability to process sections of code as transactional memory. Transactional memory can thus be considered as a way to bracket a code section such that it becomes a large, multi-argument load link/store conditional (LL/SC) transaction. The code section is executed speculatively, and the decision to commit the changes is deferred until the end of the section of code. If there has been any interference with any of the data used by the code section, such as the memory lines, cache lines, and so on, being used by the code section, then the entire transaction is aborted. Otherwise, the entire transaction is committed to memory, and the changes to the relevant memory and cache lines are effected.

While the hardware transactional memory approach is faster in performance than the software locking approach, it nevertheless suffers from some disadvantages. For the hardware transactional memory approach to work, the operations performed by the relevant section of code are accomplished within a cache before being committed to memory. However, if the cache is not large enough, or does not have great enough associativity, then the approach will fail. This is because the entire section of code will not be able to be completely executed speculatively before the processing effects of the code section are committed to memory. That is, the hardware transactional memory approach, while advantageous in performance as compared to the software locking approach, is not as widespread in its potential application as is the software locking approach. For these and other reasons, therefore, there is a need for the present invention.

SUMMARY OF THE INVENTION

The invention relates to utilizing a hardware transactional approach to execute code, after initially utilizing software locking, by employing pseudo-transactions. A method of the invention includes utilizing a software approach to locking memory to execute a code section relating to memory, and employing a pseudo-transaction to determine whether a hardware approach to execute the code section would have been successful. Where the hardware approach satisfies a threshold based on success of at least the pseudo-transaction, the hardware approach is subsequently utilized to execute the code section.

A system of the invention includes a processor having transactional memory capability, and memory. The transactional memory capability of the processor includes a pseudo-transactional memory capability that determines whether the transactional memory capability would have been successful. The memory stores a spin lock function to execute a code section by utilizing the transactional memory capability upon the transactional memory capability having satisfied a threshold based upon success of at least the pseudo-transactional memory capability.

An article of manufacture includes a computer-readable medium and means in the medium. The means in the medium is for utilizing a hardware approach to transactional memory to execute a code section after having utilized a software approach to locking memory to execute the code section and the hardware approach having satisfied a threshold based at least upon a pseudo-transaction to determine whether the hardware approach would have succeeded in executing the code section. Other features and advantages of the invention will become apparent from the following detailed description of the presently preferred embodiment of the invention, taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

The drawings referenced herein form a part of the specification. Features shown in the drawing are meant as illustrative of only some embodiments of the invention, and not of all embodiments of the invention, unless otherwise explicitly indicated, and implications to the contrary are otherwise not to be made.

FIG. 1 is a flowchart of a method according to a preferred embodiment of the invention, and is suggested for printing on the first page of the patent.

FIG. 2 is a diagram of a system having a number of nodes, in conjunction with which embodiments of the invention may be implemented.

FIG. 3 is a diagram of one of the nodes of the system of FIG. 2 in more detail, according to an embodiment of the invention.

FIG. 4 is a flowchart of a method for executing a section of code according to a hardware approach to transactional memory, according to an embodiment of the invention.

FIG. 5 is a flowchart of a method for executing a section of code according to a software approach to locking memory, according to an embodiment of the invention.

DESCRIPTION OF THE PREFERRED EMBODIMENT

Overview and Method

FIG. 1 shows a method 100, according to a preferred embodiment of the invention. Like other methods of embodiments of the invention, the method 100 may be implemented as a computer-readable medium on an article of manufacture. The medium may be a recordable data storage medium, such as a magnetic, semiconductor, and/or optical medium, a removable or a fixed medium, and/or a volatile or a non-volatile medium. The medium may also be a modulated carrier signal. The method 100 may be performed by a processor of a node of a multi-node system that is to execute a section of code that relates to memory of the node.

A hardware approach to transactional memory is initially used to execute a section of code on an all-or-nothing basis (102). That is, the hardware approach to transactional memory is utilized such that either the entire section of code is executed and committed to memory, or none of the section of code is executed and committed to memory. The hardware approach to transactional memory thus treats the section of code as a single transaction. It conditionally executes the code section, committing execution of the code section to memory only if the entire code section can be completed. The hardware approach to transactional memory is a hardware approach in that it is accomplished in hardware, such as by the transactional memory capability of the processor that is performing the method 100.

If the hardware approach does not fail a threshold in executing the code section (104), then the next time the code section needs to be executed, the hardware approach to transactional memory is again employed to execute the section of code (102). In one embodiment, the hardware approach fails the threshold if it is forced to abort execution of the code section a single time. That is, the hardware approach fails the threshold if it fails to completely execute the code section a single time. In another embodiment, the hardware approach fails the threshold if it is forced to abort execution of the code section a predetermined number of times. Abortion of code section execution may be caused when another code section is attempting to read from and/or write to the same memory that the first code section is processing, for instance. Other approaches to determine whether the hardware approach has failed the threshold are described in a later section of the detailed description.

If the hardware approach fails the threshold in executing the code section (104), then a software approach to locking memory is instead utilized to execute the section of code (106). The software approach is utilized by first locking the memory to which the code section relates. The code section is then executed, and is committed to memory as it is executed. No other sections of code can read from and/or write to the same memory to which the code section relates, because the code section has placed a lock on the memory. When the code section has finished being executed, the lock on the memory that it was accessing is released, or removed. The software approach to locking memory may be implemented by a spin-lock function that is called prior to executing the section of code, and a spin-unlock function that is called after executing the section of code, as is described in more detail in a later section of the detailed description.

Preferably, after the software approach has been utilized to execute the section of code, if the hardware approach to transactional memory has again satisfied the threshold (108), then the hardware approach is utilized the next time the code section needs to be executed (102). As will be described in more detail in a later section of the detailed description, this can be implemented in one embodiment by having a pseudo-transaction executed, or performed, concurrently with the software approach in 106. A pseudo-transaction is similar to an actual hardware transaction employed by the hardware approach to transactional memory, but unconditionally performs the instructions in the code section, and unconditionally commits execution of the code section to memory. A pseudo-transaction never aborts, but rather determines whether an actual transaction would have been successful in execution. That is, a pseudo-transaction is employed to determine whether utilizing the hardware approach to transactional memory would have been successful in executing the code section. Thus, a pseudo-transaction can be employed to determine whether the hardware approach to transactional memory has again satisfied the threshold in 108. Furthermore, determining whether the hardware approach to transactional memory has satisfied the threshold can be based upon the success of previous pseudo-transactions and/or previous transactions.

However, if the hardware approach has not satisfied the threshold (108), then the software approach is utilized the next time the code section needs to be executed (106). In this way, the software approach is a fallback approach to executing the section of code where the hardware approach is the default and preferred approach to executing the section of code. This may be because the hardware approach provides for improved system performance as compared to utilizing the software approach, for instance.
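The decision flow of the method 100 can be summarized as a small state machine. The sketch below is illustrative only and assumes the single-failure threshold embodiment; the names exec_mode and next_mode are hypothetical and do not appear in the pseudo-code described later.

```c
#include <stdbool.h>

/* Hypothetical names: which approach executes the code section next. */
typedef enum { MODE_HARDWARE, MODE_SOFTWARE } exec_mode;

/* Returns the approach to use the next time the code section runs.
 * txn_succeeded is the outcome of the real transaction (hardware mode)
 * or of the pseudo-transaction (software mode). */
exec_mode next_mode(exec_mode current, bool txn_succeeded)
{
    if (current == MODE_HARDWARE) {
        /* 104: a real transaction that aborted fails the threshold,
         * so fall back to software locking (106). */
        return txn_succeeded ? MODE_HARDWARE : MODE_SOFTWARE;
    }
    /* 108: a pseudo-transaction that would have succeeded satisfies
     * the threshold, so return to the hardware approach (102). */
    return txn_succeeded ? MODE_HARDWARE : MODE_SOFTWARE;
}
```

Under this simple threshold the next mode depends only on the most recent outcome; the two branches are kept separate to mirror the two decision points 104 and 108 of FIG. 1.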

System and Code Section Execution

FIG. 2 shows a system 200 in accordance with which embodiments of the invention may be implemented. The system 200 includes a number of nodes 202A, 202B, 202C, and 202D, which are collectively referred to as the nodes 202. The nodes 202 are connected with one another through an interconnection network, or interconnect, 204. Each of the nodes 202 may include at least one processor and memory. Where the system 200 is a non-uniform memory architecture (NUMA) system, the memory of a given node is local to the processors of the node, and is remote to the processors of the other nodes. However, the system 200 may be another type of system in lieu of being a NUMA system.

FIG. 3 shows in more detail a node 300, according to an embodiment of the invention, that can implement one or more of the nodes 202 of FIG. 2. As can be appreciated by those of ordinary skill within the art, only those components needed to implement one embodiment of the invention are shown in FIG. 3, and the node 300 may include other components as well. The node 300 includes a processor 302 and a memory 304. There may be other processors within the node 300 besides the processor 302. The memory 304 may be or include random-access memory (RAM), as well as other types of memory, such as non-volatile memory, read-only memory (ROM), and so on.

The processor 302 includes transactional memory capability 306, which is used to effect the hardware transactional approach to executing code sections, as has been described. Alternatively, the transactional memory capability 306 may be a part of hardware other than the processor 302. The transactional memory capability 306 may in one embodiment include pseudo-transactional memory capability as well, such that it can be determined whether the hardware transactional approach to executing code sections would have been successful, even where the hardware transactional approach is nevertheless not currently employed for code section execution.

The memory 304 includes a code section 308, data 310, a spin lock function 312, and a spin unlock function 314. The code section 308 is a section of code that is preferably executed on an all-or-nothing basis. That is, either the entirety of the code section 308 is executed and committed to memory, or none of the code section 308 is executed and committed to memory. The data 310 is the part of the memory 304 to which the code section 308 relates. That is, the data 310 is the data that is processed by the code section 308.

The spin lock function 312 and the spin unlock function 314 effect the software approach to locking and unlocking memory that has been described. Particularly, the spin lock function 312 is called to lock the memory, such as the data 310, for the code section 308 to be executed without interruption or corruption of the data 310. The spin unlock function 314 is then called to unlock the memory after the code section 308 has been executed. That is, the unlock function 314 is called to remove, or release, the lock on the data 310 after the code section 308 has been executed. As is described in more detail in a later section of the detailed description, the spin lock and unlock functions 312 and 314 may default to utilization of the transactional memory capability 306 of the processor 302 to execute the code section 308, and utilize their software locking capability as a fallback approach for executing the code section 308.

FIG. 4 shows a method 400 for using a hardware approach to transactional memory to execute a section of code, according to an embodiment of the invention. For instance, the method 400 may be that which is performed by the transactional memory capability 306 to execute the code section 308. First, a transaction inclusive of the relevant section of code is started (402). The transaction is conditionally executed (404). For instance, results of the conditional execution of the transaction may be temporarily stored in a processor cache or other type of cache. If the transaction has successfully completed (406), then execution of the transaction is committed to memory (408), such that the entire section of code has been executed. Otherwise, the transaction is aborted (410), and none of the section of code is effectively executed in actuality.
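The all-or-nothing behavior of the method 400 can be imitated in software for illustration. The sketch below is not the hardware mechanism itself: it assumes a small private buffer standing in for the cache, and a conflict flag that real hardware would set on detecting interference. The names sim_txn, sim_begin, sim_write, and sim_end are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

#define DATA_WORDS 4

/* Hypothetical software stand-in for a hardware transaction. */
typedef struct {
    int  buffer[DATA_WORDS];  /* speculative copy, standing in for a cache */
    bool conflict;            /* set when interference is detected         */
} sim_txn;

/* 402: start a transaction by taking a private copy of the data. */
void sim_begin(sim_txn *t, const int *memory)
{
    memcpy(t->buffer, memory, sizeof t->buffer);
    t->conflict = false;
}

/* 404: conditional execution touches only the private copy. */
void sim_write(sim_txn *t, int index, int value)
{
    t->buffer[index] = value;
}

/* 406-410: commit every write, or abort and leave memory untouched. */
bool sim_end(sim_txn *t, int *memory)
{
    if (t->conflict)
        return false;                             /* 410: abort  */
    memcpy(memory, t->buffer, sizeof t->buffer);  /* 408: commit */
    return true;
}
```

In this sketch memory changes only inside sim_end, which is what gives the all-or-nothing effect: until the commit, other readers of memory see none of the speculative writes.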

FIG. 5 shows a method 500 for using a software approach to locking memory to execute a section of code, according to an embodiment of the invention. For instance, the method 500 may be that which is performed by the spin lock and unlock functions 312 and 314 to execute the code section 308. First, a lock is placed on the memory to which a relevant section of code relates (502). This is the memory that is to be processed by the section of code, such as the data 310 of the memory 304. The lock prevents other sections of code, for instance, from processing the memory while the relevant section of code is processing the memory. The code section is then executed (504), such that execution of the code section is committed to memory as it is executed (506). That is, the code section is not executed on a conditional basis. Since the memory to which the code section relates is locked, the code section may be committed to memory as it is executed. Finally, the lock on the memory to which the code section relates is removed, or released (508), so that other code sections, for instance, may process the memory.
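A plain spin lock following the steps of the method 500 can be sketched with the standard C11 atomic_flag type. This is an illustrative example only, not the spin lock function 312 described later; the names data_lock, shared_data, and locked_increment are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical lock and protected data (assumptions for illustration). */
static atomic_flag data_lock = ATOMIC_FLAG_INIT;
static int shared_data;

/* Executes a tiny code section under a spin lock and returns the new value. */
int locked_increment(void)
{
    /* 502: spin until the lock is obtained. */
    while (atomic_flag_test_and_set_explicit(&data_lock, memory_order_acquire))
        ;  /* another section of code holds the lock; keep spinning */

    /* 504/506: the code section runs unconditionally, and its writes
     * go to memory as they are executed. */
    int result = ++shared_data;

    /* 508: release the lock so that other code sections may proceed. */
    atomic_flag_clear_explicit(&data_lock, memory_order_release);
    return result;
}
```

The acquire and release orderings are one reasonable choice for a lock of this kind; they ensure that writes made inside the code section are visible to the next holder of the lock.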

Particular Embodiment and Pseudo-Code

A particular embodiment of the spin lock function 312 and the spin unlock function 314 is now described, in relation to pseudo-code that implements both of these functions. The functions 312 and 314 are specifically described as implementing both the software approach to locking memory and the hardware approach to transactional memory that have been described. The spin lock function 312 is called to lock the relevant memory for a section of code to be executed, be it by the hardware or the software approach. The spin unlock function 314 is then called to release the lock from the memory after the section of code has been executed.

First, a number of memory-transaction primitives are described that are utilized in the pseudo-code. The primitives include begin_txn( ), begin_txn_check( ), commit_txn( ), and abort_txn( ). The primitive begin_txn( ) may be of type integer, and marks the start of a hardware transaction. It returns true. If a given transaction is then aborted by the hardware, execution resumes after the corresponding begin_txn( ), which returns false. This can be implemented in one embodiment with an instruction that takes a branch address for the abort path, so long as that instruction restores registers in the event of an abort. In another embodiment, software can save and restore the registers, but this approach may impose undesired added overhead on the system.

The primitive begin_txn_check( ) marks the start of a pseudo-transaction. A pseudo-transaction does not affect instruction execution, except to track whether a real transaction would have been successful. The pseudo-code uses this primitive to determine when it is acceptable to switch back from a software locking approach to a hardware transactional approach. Although not included in the pseudo-code, an additional primitive of type int, end_txn_check( ), may be provided to mark the end of a pseudo-transaction, returning true if a real transaction would have succeeded. However, this primitive is not needed, as described in the next paragraph, and thus is not included in the pseudo-code.

The primitive commit_txn( ) may be of type integer, and marks the end of a transaction. All memory writes that were speculatively executed since the matching begin_txn( ) are made permanent, and visible to other processors. This primitive also ends the effect of a matching begin_txn_check( ) primitive, returning true if a real transaction would have succeeded. Thus, the primitive end_txn_check( ) described in the preceding paragraph is not needed in all embodiments of the invention.

Finally, the primitive abort_txn( ) has a parameter mimic_hw of type integer. This primitive aborts the current transaction. If mimic_hw is true, then execution resumes with the matching begin_txn( ) returning false. Otherwise, execution continues after the abort_txn( ). It is not permissible to pass true to an abort_txn( ) that matches a begin_txn_check( ). In one embodiment, it may be useful to have the primitive begin_txn_check( ) return a true or false value so that abort_txn( ) can mimic a hardware abort, even for a pseudo-transaction.

The pseudo-code is line-numbered alphanumerically for descriptive convenience. The pseudo-code additionally is an example of a software-codified implementation of the method 100, as can be appreciated by those of ordinary skill within the art. Three initial definitions are first provided:

-   A1 typedef atomic_t txn_lock;
-   A2 #define TXN_LOCK_HELD 0x80000000
-   A3 #define TXN_LOCK_DOLOCK 0x40000000
-   A4 #define TXN_LOCK_OWNER 0x3fffffff

Line A1 defines the type txn_lock as an atomic type. Lines A2, A3, and A4 define the constants TXN_LOCK_HELD, TXN_LOCK_DOLOCK, and TXN_LOCK_OWNER. The constant TXN_LOCK_HELD refers to the scenario where a software lock is currently being held, whereas the constant TXN_LOCK_DOLOCK refers to the scenario where a software locking approach, in lieu of a hardware transactional approach, is to be utilized. The constant TXN_LOCK_OWNER defines a bit field into which an identifier for the processor or thread holding the lock is placed.
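The three constants partition a 32-bit lock word into two flag bits and a 30-bit owner field. The following sketch shows how such a word could be decoded; the helper names lock_is_held, lock_wants_software, and lock_owner are hypothetical and are not part of the pseudo-code, and the use of uint32_t for the lock word is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define TXN_LOCK_HELD   0x80000000u  /* bit 31: software lock is held       */
#define TXN_LOCK_DOLOCK 0x40000000u  /* bit 30: use software locking        */
#define TXN_LOCK_OWNER  0x3fffffffu  /* bits 0-29: holder's identifier      */

/* Hypothetical helpers that decode a lock word. */
bool lock_is_held(uint32_t word)
{
    return (word & TXN_LOCK_HELD) != 0;
}

bool lock_wants_software(uint32_t word)
{
    return (word & TXN_LOCK_DOLOCK) != 0;
}

uint32_t lock_owner(uint32_t word)
{
    return word & TXN_LOCK_OWNER;
}
```

For example, the word TXN_LOCK_HELD | TXN_LOCK_DOLOCK | 7 describes a software lock currently held by the processor or thread whose identifier is 7.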

The spin lock function 312 is then provided as:

-   B1 spin_lock(txn_lock *tp)
-   B2 {
-   B3 int oldval;
-   B4 int newval;

The spin_lock function receives in line B1 as an argument a pointer tp to a variable of type txn_lock. The variables oldval and newval are declared in lines B3 and B4, and used internally by the spin_lock function to hold values from atomic reads on the variable tp.

-   B5 for (;;) {
-   B6 oldval = atomic_read(tp);
-   B7 if (oldval & TXN_LOCK_DOLOCK) {
-   B8 while (((oldval = atomic_read(tp)) & ~TXN_LOCK_OWNER)
-   B9 == (TXN_LOCK_DOLOCK | TXN_LOCK_HELD)) {
-   B10 continue;
-   B11 }

The spin_lock function first atomically reads the variable tp as the variable oldval in the line B6. The “if” clause of lines B7-B11 is executed if the variable tp indicates that a software lock should be employed. The while loop of lines B8-B11 is executed to constantly loop while the variable tp, which is read as the variable oldval in line B8, continues to show that a software lock should be used, and that the software lock is in fact being held.

-   B12 if (oldval != TXN_LOCK_DOLOCK) {
-   B13 continue;
-   B14 }

Next, if the variable oldval indicates that a software lock should not be utilized, in line B12, then in line B13 the continue statement causes the spin_lock function to reexecute, beginning at line B5.

-   B15 newval = oldval | TXN_LOCK_HELD;
-   B16 if (cmpxchg(tp, oldval, newval) != oldval) {
-   B17 continue;
-   B18 }

The variable newval is set equal to the variable oldval, logically OR'ed with the constant TXN_LOCK_HELD, in line B15 to indicate that the software lock is held.

The compare and exchange function is used in the “if” clause in line B16 to attempt to atomically replace the value of the variable tp with the variable newval, which succeeds only if the variable tp still holds the value of the variable oldval. If the value returned differs from the variable oldval, then this means that some other processor or thread modified the lock value, so that the attempted update fails, and the continue statement in line B17 causes the spin_lock function to reexecute, beginning at line B5.
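The pseudo-code does not define cmpxchg. One plausible rendering, assuming it returns the value that was in the lock word before the attempt, can be built on the standard C11 function atomic_compare_exchange_strong. The use of atomic_int in place of the txn_lock type is an assumption made so that the sketch is self-contained.

```c
#include <stdatomic.h>

/* Assumed semantics: if *target equals expected, store desired.
 * In both cases, return the value *target held before the attempt.
 * The update succeeded exactly when the return value equals expected. */
int cmpxchg(atomic_int *target, int expected, int desired)
{
    /* On failure, the C11 call overwrites 'expected' with the value
     * actually observed; on success, 'expected' is left unchanged and
     * therefore already equals the previous value. */
    atomic_compare_exchange_strong(target, &expected, desired);
    return expected;
}
```

With these semantics, the test cmpxchg(tp, oldval, newval) != oldval in line B16 is true precisely when another processor or thread changed the lock word between the atomic read and the attempted update.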

-   B19 begin_txn_check( );

The begin_txn_check( ) function is called in line B19 to flag the beginning of a pseudo-atomic section. The hardware will determine whether an atomic transaction equivalent to the lock's critical section would have succeeded, and report that via the commit_txn function in spin_unlock, as will be described.

-   B20 } else {
-   B21 if (!begin_txn( )) {
-   B22 oldval = atomic_read(tp);
-   B23 if ((oldval & TXN_LOCK_DOLOCK) == 0) {
-   B24 newval = oldval | TXN_LOCK_DOLOCK;
-   B25 (void)cmpxchg(tp, oldval, newval);
-   B26 }
-   B27 continue;
-   B28 }

The “if” clause in line B21 begins a hardware transaction, returning a non-zero result. If this transaction is later aborted, control will return to this begin_txn, which will then return a zero result. Thus, lines B22 through B27 are executed only when a hardware transaction is aborted. In this instance, if the variable oldval does not indicate that a software lock should be used, in line B23, then the variable newval is set equal to the variable oldval logically OR'ed with the constant TXN_LOCK_DOLOCK, in line B24, to indicate that the software lock should now be used in preference to hardware transactions when executing the code section in question. The compare and exchange function is used in line B25 to attempt to set the variable tp to the variable newval, and the continue statement in line B27 causes the spin_lock function to reexecute, beginning at line B5.

-   B29 oldval = atomic_read(tp);
-   B30 if (oldval & TXN_LOCK_DOLOCK) {
-   B31 abort_txn(FALSE);
-   B32 continue;
-   B33 }
-   B34 }
-   B35 }
-   B36 }

Finally, the variable oldval again is set equal to the variable tp, as atomically read in line B29. If the variable oldval indicates that a software lock should be used in line B30, then the hardware approach to transactional memory is aborted in line B31, and the spin_lock function reexecutes, beginning at line B5, due to the continue statement in line B32.

The spin unlock function 314 is provided as:

-   C1 spin_unlock(txn_lock *tp)
-   C2 {
-   C3 int newval;
-   C4 int nextval;
-   C5 int oldval;
-   C6 int result;

The spin_unlock function receives in line C1 as an argument a pointer tp to a variable of type txn_lock. The variables newval, nextval, and oldval are declared in lines C3-C5, and are used internally by the spin_unlock function to hold values from atomic reads on the variable tp and to compute new values to be stored into variable tp via the cmpxchg function. The variable result is used to store the results from attempting to commit the transaction encompassing the code section in question by utilizing a hardware approach to transactional memory.

-   C7 result=commit_txn( );

The function commit_txn( ) is called in line C7, and the variable result is set equal to its return value, to commit execution of the section of code in question when using a hardware approach to transactional memory. If the software locking approach was instead used, the function commit_txn( ) instead indicates whether the hardware approach would have succeeded had it been used.

-   C8 if (((oldval = atomic_read(tp)) & (TXN_LOCK_HELD |
-   C9 TXN_LOCK_OWNER)) == (TXN_LOCK_HELD | me( ))) {
-   C10 if (result) {
-   C11 newval = 0;
-   C12 } else {
-   C13 newval = TXN_LOCK_DOLOCK;
-   C14 }

In line C8, the variable oldval is set equal to the atomically read value of the variable tp. The “if” clause in lines C8 and C9 determines whether the variable oldval indicates that a software lock is being held by this processor or thread, where the me( ) function returns a unique identifier for the currently running processor, process, or thread. If the “if” clause yields true, then lines C10-C13 are performed. If the result of the commit_txn( ) operation in line C7 yielded a true result, indicating that the transaction could have been successfully committed to memory using the hardware approach, as tested in line C10, then the variable newval is set equal to zero in line C11. Setting the variable newval to zero will then be used to indicate that a software lock should not be later employed. Otherwise, if the variable result yielded a false result, as tested in line C10, then this indicates that the transaction would not have been successfully committed to memory using the hardware approach, and in line C13 the variable newval is set to the constant TXN_LOCK_DOLOCK, to indicate that a software lock should be subsequently employed.

-   C14 while ((nextval =
-   C15 cmpxchg(tp, oldval, newval)) != oldval) {
-   C16 oldval=nextval;
-   C17 }
-   C18 }
-   C19 }

The variable nextval is set to the result of the compare and exchange function in lines C14 and C15. If the variable nextval is not equal to the variable oldval, then the variable oldval is set equal to the variable nextval in line C16, and the while loop of lines C14-C17 is repeated until the variable nextval is equal to the variable oldval. That is, the while loop of lines C14-C17 is employed to effectuate the variable newval, as had been set in line C11 or line C13, within the variable tp.

The pseudo-code that has been described utilizes both actual hardware transactions, via the hardware transactional approach, as well as pseudo-transactions. The pseudo-transactions are employed to determine whether the hardware transactional approach would have been successful, so that the hardware transactional approach can be switched back to from the software locking approach. However, in another embodiment, once utilization of the hardware transactional approach has yielded to use of the software locking approach, the hardware transactional approach is never again utilized. That is, the software locking approach never switches back to the hardware transactional approach. In this embodiment, pseudo-transactions, and the corresponding pseudo-transaction primitives, are not needed and are not used.

Furthermore, in another embodiment, pseudo-transactions and their corresponding primitives may not be present, but the ability to switch back from use of the software locking approach to the hardware transactional approach may nevertheless be provided. For example, the pseudo-code may instead randomly select between real hardware transactions and software locking, weighted by historical transaction success and failure statistics. Such an approach, as well as other approaches, thus allows for the use of the hardware transactional approach even after the software locking approach has been employed, and where pseudo-transactional capability is not provided.
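The weighted random selection mentioned above could take many forms. One minimal sketch, assuming that the probability of choosing a hardware transaction equals the historical fraction of successful transactions, is given below. The names txn_stats and choose_hardware are hypothetical, and the caller is assumed to supply a random draw uniformly distributed in [0, 1).

```c
#include <stdbool.h>

/* Hypothetical running totals of real hardware transaction outcomes. */
typedef struct {
    unsigned successes;
    unsigned failures;
} txn_stats;

/* Returns true to attempt a hardware transaction, false to use software
 * locking. 'draw' is a caller-supplied random number in [0, 1). */
bool choose_hardware(const txn_stats *s, double draw)
{
    unsigned total = s->successes + s->failures;
    if (total == 0)
        return true;  /* assumption: with no history, hardware is the default */

    /* Probability of choosing hardware = historical success fraction. */
    double p_hardware = (double)s->successes / (double)total;
    return draw < p_hardware;
}
```

Taking the random draw as a parameter keeps the selection policy separate from the random number source, so that the policy can be exercised with fixed inputs.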

Alternative Embodiments

The pseudo-code listed and described in the previous section of thedetailed description uses a simple threshold to determine whether thehardware approach to transactional memory should yield to the softwareapproach to locking memory in executing the section of code in question.Specifically, in line B21, the hardware approach to transactional memoryfails the threshold where it has aborted. That is, the hardware approachto transactional memory fails the threshold where it has abortedexecution of the code section a single time.

Similarly, the pseudo-code uses the same simple threshold to determine whether the software approach to locking memory should yield back to the hardware approach to transactional memory in executing the code section in question. Specifically, in line C9, the hardware approach to transactional memory satisfies the threshold where it would not have aborted when executing the section of code. That is, the hardware approach satisfies the threshold where it has, or would have, successfully committed the transaction encompassing the code section.

However, in alternative embodiments of the invention, more sophisticated thresholds are employed to determine whether the software approach to locking memory should be used in lieu of the hardware approach to transactional memory, and vice-versa, in executing a section of code. One such alternative embodiment has already been described, where the hardware approach has to fail to execute the code section, or abort the code section, a predetermined number of times greater than one before the software approach is employed. Likewise, the hardware approach would have had to successfully execute the code section the predetermined number of times before it is again actually used in lieu of the software approach.
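A counting threshold of this kind can be sketched as follows. The sketch is illustrative only: the name `CountingThreshold` and the parameter `limit` are hypothetical, and it assumes that the failures and successes must be consecutive, which the description above does not require.

```python
class CountingThreshold:
    """Hypothetical sketch: switch approaches only after `limit`
    consecutive outcomes that argue for the other approach."""

    def __init__(self, limit=3):
        self.limit = limit          # the predetermined number of times
        self.use_software = False   # begin with the hardware approach
        self.streak = 0             # consecutive outcomes favoring a switch

    def record(self, hardware_ok):
        """hardware_ok is True if a real transaction committed, or if a
        pseudo-transaction indicates that it would have committed."""
        # While locking in software, successes argue for switching back;
        # while using hardware, aborts argue for switching away.
        favors_switch = hardware_ok if self.use_software else not hardware_ok
        self.streak = self.streak + 1 if favors_switch else 0
        if self.streak >= self.limit:
            self.use_software = not self.use_software
            self.streak = 0
```

With `limit=1` the sketch reduces to the simple single-abort, single-success threshold of the pseudo-code described in the previous section.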

In one embodiment, a digital filter is used to maintain state within the lock. A digital filter slows the response of a system where the inputs change too quickly. For instance, utilization of the software approach to locking memory may cause the state to increase by a fraction, and utilization of the hardware approach to transactional memory may cause the state to decrease by the fraction, where the state can vary between zero and one. If the state is greater than a given threshold, such as one-half, then the software approach is utilized, whereas if it is less than the threshold, then the hardware approach is utilized.
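The filtered state can be sketched as follows. This is a minimal sketch under stated assumptions: the class name `FilteredLockState`, the step of one-quarter, and the handling of a state exactly equal to the threshold (treated here as favoring the hardware approach) are choices made for the example and are not taken from the pseudo-code.

```python
class FilteredLockState:
    """Hypothetical sketch of per-lock filter state held in [0, 1]."""

    def __init__(self, step=0.25, threshold=0.5):
        self.step = step            # the fraction added or removed per use
        self.threshold = threshold  # e.g., one-half
        self.state = 0.0

    def record_software_use(self):
        # Utilization of software locking raises the state, capped at one.
        self.state = min(1.0, self.state + self.step)

    def record_hardware_use(self):
        # Utilization of a hardware transaction lowers it, floored at zero.
        self.state = max(0.0, self.state - self.step)

    def use_software(self):
        # Assumption: a state equal to the threshold selects hardware.
        return self.state > self.threshold
```

A smaller step makes the lock slower to change approaches, which damps oscillation between the two approaches when outcomes alternate rapidly.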

In another embodiment, the compiler may pass information to the spin_lock( ) and spin_unlock( ) functions of the pseudo-code provided in the previous section of the detailed description. For instance, the compiler may determine a score based on the notion of transfer functions known to those of ordinary skill within the art. That is, the score reflects the expected number of memory references in the expected critical parts of the section of code in which the code section causes the transaction to abort, such as the number of references to distinct cache lines within the section of code. A transfer function is generated based on this number. Compilers that have full awareness of the hardware structures, such as cache size, associativity, and other transactional limitations, may be able to provide better estimates of the likelihood of hardware transactional success. The spin_unlock( ) function may be more aggressive in clearing the need for software locking where transactions are more likely to succeed. Information from the hardware of the system, such as the processor thereof, is thus passed to the spin_lock( ) and spin_unlock( ) functions through the compiler.

In another embodiment, the success rates of utilizing the hardware transactional approach are tracked. However, the act of tracking the success rate may itself make the transactions encompassing the code sections more likely to fail. Therefore, the spin_lock( ) function should record its identity so that the spin_unlock( ) function can communicate the measurements made. This may be accomplished within a machine register, bearing in mind that there may be many-to-many relationships between spin_lock( ) and spin_unlock( ) functional primitives.

In another embodiment, a per-lock caller state is maintained, which is comparable to branch-prediction tables in processors, as can be appreciated by those of ordinary skill within the art. The same lock may often be used for multiple critical parts of a code section that can cause transaction abortion and that have differing cache requirements. The spin_lock( ) function may record its address in the lock when acquiring the lock, and the spin_unlock( ) function may measure the transaction-completion success rate on a per-spin_lock( ) basis. The spin_lock( ) function can then more aggressively use transactions on sections of code where there have been good records of success.
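The per-caller bookkeeping can be sketched as a small table keyed by the address of the acquiring spin_lock( ) call site. The sketch is illustrative only: the class `PerCallerStats`, the `min_rate` cutoff, and the optimistic default for call sites with no history are assumptions, and a Python dictionary stands in for whatever table would be held in or alongside the lock.

```python
class PerCallerStats:
    """Hypothetical per-lock table of hardware commit rates, keyed by an
    identifier for the spin_lock( ) call site (e.g., its address)."""

    def __init__(self, min_rate=0.5):
        self.min_rate = min_rate  # assumed cutoff for preferring hardware
        self.table = {}           # call site -> [commits, attempts]

    def record(self, caller, committed):
        """Called on the unlock side with the outcome for this call site."""
        entry = self.table.setdefault(caller, [0, 0])
        if committed:
            entry[0] += 1
        entry[1] += 1

    def prefer_transaction(self, caller):
        """Called on the lock side to choose the approach for this site."""
        commits, attempts = self.table.get(caller, (0, 0))
        if attempts == 0:
            return True  # assumption: be optimistic with no history
        return commits / attempts >= self.min_rate
```

Keeping the statistics per call site lets one critical part with a small cache footprint continue to use transactions even when another critical part guarded by the same lock aborts frequently.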

In another embodiment, the number of times that a given section of code has transactionally failed is counted, such that the spin_lock( ) function is more likely to use software locking in cases where there have been multiple failures, even if the failures are not sequential. Furthermore, queued software locks or non-uniform memory-architecture (NUMA) software locks, as known to those of ordinary skill within the art, can be particularly used in differing embodiments of the invention. Reader-writer software locks, as known to those of ordinary skill within the art, may also be used in an alternative embodiment of the invention.

The pseudo-code described in the previous section of the detailed description is particularly useful where the software locks in question are perfectly nested. However, where the software locks are imperfectly nested, such as is the case with hierarchical locks, alternative approaches may be considered. First, the enclosing transaction may be aborted when a hierarchical lock is encountered. Alternatively, the hardware transaction application-programming interface (API) may be modified to accept the address of the lock, permitting the hardware to match the hierarchical transactions. In addition, a software check may be performed to determine if an enclosing transaction is currently being executed, such that the inner locks use the software approach in lieu of the hardware approach.
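The last of these alternatives, the software check for an enclosing transaction, can be sketched as follows. The sketch only models the decision itself and starts no real hardware transaction: the thread-local depth counter and the names `enclosing_transaction` and `choose_approach` are hypothetical.

```python
import threading
from contextlib import contextmanager

# Assumption: each thread tracks how many transactions currently enclose it.
_tls = threading.local()


def in_transaction():
    """Software check: is an enclosing transaction being executed?"""
    return getattr(_tls, "depth", 0) > 0


@contextmanager
def enclosing_transaction():
    """Stand-in for the span of an outer transaction; it only maintains
    the depth counter and does not begin a hardware transaction."""
    _tls.depth = getattr(_tls, "depth", 0) + 1
    try:
        yield
    finally:
        _tls.depth -= 1


def choose_approach():
    """Inner locks use the software approach inside a transaction."""
    return "software" if in_transaction() else "hardware"
```

For example, `choose_approach()` returns "hardware" at the outermost level and "software" when called inside a `with enclosing_transaction():` block on the same thread.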

Advantages Over the Prior Art

Embodiments of the invention allow for advantages over the prior art. Whereas utilizing a hardware approach to transactional memory to execute code sections can be advantageous from a performance perspective, embodiments of the invention nevertheless fall back on a slower software approach to execute the sections of code where the hardware approach fails, or aborts, too often. The embodiments of the invention thus ensure that the hardware approach is utilized where appropriate, such that the performance gains of utilization of the hardware approach are maintained. The embodiments also ensure that the software approach is utilized where the hardware approach is not appropriate, so that overall forward progress in executing the code sections continues and does not hang on an overly aborting hardware approach.

Conclusion

It will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without departing from the spirit and scope of the invention. For instance, the system that has been described as amenable to implementations of embodiments of the invention has been indicated as having a non-uniform memory access (NUMA) architecture. However, the invention is amenable to implementation in conjunction with systems having other architectures as well. Accordingly, the scope of protection of this invention is limited only by the following claims and their equivalents.

1. An article of manufacture comprising: a computer-readable medium; and, means in the medium for utilizing a hardware approach to transactional memory to execute a code section after having utilized a software approach to locking memory to execute the code section and the hardware approach to transactional memory having satisfied a threshold based at least upon a pseudo-transaction to determine whether the hardware approach would have succeeded in executing the code section.

2. The article of claim 1, wherein the means utilizes the hardware approach to transactional memory where the hardware approach to transactional memory would have successfully executed the code section a predetermined one or more times.

3. The article of claim 1, wherein the hardware approach satisfies the threshold also based on previous transactions utilized by the hardware approach to execute the code section and on previous pseudo-transactions.

4. The article of claim 1, wherein the computer-readable medium is one of a recordable data storage medium and a modulated carrier signal.